Women Writers Project

Iker, Yidi, & Yu

Infographic

Image Name: Books

Motivation

The Women Writers Project is a program dedicated to collecting and promoting writings by women from the 16th through the 19th centuries. The group aims to provide an inclusive and diverse perspective on the historical and literary landscape by including works from women of different races, cultures, and social backgrounds. Its ultimate goal is to make these writings available to the public, especially to college students for research purposes.

Our team is working with the Women Writers Project to create visualizations that represent various aspects of the collected writings: genres, publication dates and locations, authors, author genders, and citations. By presenting these trends and patterns visually, we hope to communicate important insights to the public and to academic communities. Our work aligns with the Women Writers Project's mission to promote and preserve historical texts and to highlight the contributions of women writers throughout history. Through our visualizations, we aim to inspire and facilitate further research into these works and appreciation of them, while also promoting the writings themselves. The project serves as an important reminder of the significance of women's voices in literature and history.

We chose to focus on the Women Writers Project because of the period its writings come from. Between the 16th and 19th centuries, women who took part in education-related activities were seen as unusual. Gender roles were clearly established, but this did not stop women from producing these excellent writings. Our motivation comes from the bravery these women showed in breaking social norms.

Data

The data we are working with was collected by the Women Writers Project. We met with members of the organization several times to clarify the topics we would focus on, which include genres such as classical drama, novels, poetry, and sacred texts. Direct communication with our partners enabled us to ask data-related questions and gain a better understanding of the dataset. We are grateful to our partner for their feedback on our work throughout this collaboration, which led to a better final product.

The datasets are organized by genre and contain information about works and their references to other works: a unique identifier for each work, a list of referenced works with their unique identifiers, the number of times each is referenced within the work, and their full titles. The datasets also include the total number of gestures made in the work, the different types of gestures made (with their counts), a description of the work, and its publication date. Notably, all counts in the datasets are greater than or equal to one, which indicates that our partner recorded only works that are actually referenced.

One advantage of these datasets is that they contain no missing values, since the Women Writers Project strives to preserve the value of the writings. The data was therefore already clean. The main challenge was the JSON format, which none of us had used before. To work around it, we used Tableau to view the data in a more familiar spreadsheet format, which helped us understand the data types within the datasets. The publication date is ordinal, the total gestures and counts are quantitative, and all other fields are categorical.

During a meeting with our partner, we had difficulty identifying an appropriate final product and dataset for our project. Fortunately, our partner supplied us with a bibliography dataset, which enabled us to develop our second final product, a Sankey visualization showing the relationship between author gender and location. Our analysis of that dataset focused specifically on the link between these two variables.
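As an illustration of the spreadsheet-style view described above, the following minimal Python sketch flattens one nested JSON record into one row per referenced work. We did this step in Tableau; the code is only an alternative illustration. Every key name (`id`, `date`, `total_gestures`, `gesture_types`, `references`, `count`) and every value in the sample record is a hypothetical stand-in, not the Women Writers Project's actual schema or data.

```python
import json

# Hypothetical record: all key names and values are assumptions made for
# illustration, not the Women Writers Project's real schema or data.
SAMPLE_JSON = """
[
  {
    "id": "work-001",
    "title": "Example Work A",
    "date": 1795,
    "total_gestures": 3,
    "gesture_types": {"quotation": 2, "citation": 1},
    "references": [
      {"id": "ref-010", "title": "Example Source X", "count": 2},
      {"id": "ref-011", "title": "Example Source Y", "count": 1}
    ]
  }
]
"""


def flatten_references(works):
    """Return one flat row per (work, referenced work) pair."""
    rows = []
    for work in works:
        for ref in work.get("references", []):
            rows.append({
                "work_id": work["id"],
                "work_date": work["date"],
                "total_gestures": work["total_gestures"],
                "ref_id": ref["id"],
                "ref_title": ref["title"],
                "ref_count": ref["count"],
            })
    return rows


works = json.loads(SAMPLE_JSON)
rows = flatten_references(works)
for row in rows:
    print(row)
```

Each row repeats the work-level fields next to one reference, which is roughly the shape a spreadsheet tool shows after expanding a nested list.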

Task Analysis

Domain Task | Analytic Task (Low-level, “Query”) | Search Task (Mid-level) | Analyze Task (High-level)
----------- | ---------------------------------- | ----------------------- | -------------------------
Across all publications, find the works that contain more than one gesture. | Retrieve value, Filter | Locate | Discover, Present
Find the publications from each time period and count them. | Filter | Browse | Discover, Present
Find the most popular genre of women’s work in the early modern era. | Sort, Find Extremum | Locate | Discover
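
The three domain tasks above map onto simple operations over a list of records. The sketch below is a hedged illustration in Python, not the code behind our visualizations. The records and the field names (`title`, `genre`, `year`, `total_gestures`) are hypothetical, and grouping by decade is one assumed way of counting publications over time.

```python
from collections import Counter

# Hypothetical records: field names and values are illustrative assumptions,
# not entries from the Women Writers Project data.
works = [
    {"title": "Work A", "genre": "poetry", "year": 1650, "total_gestures": 1},
    {"title": "Work B", "genre": "novels", "year": 1794, "total_gestures": 4},
    {"title": "Work C", "genre": "poetry", "year": 1798, "total_gestures": 2},
]

# Task 1 (Retrieve value, Filter): works with more than one gesture.
multi_gesture = [w["title"] for w in works if w["total_gestures"] > 1]

# Task 2 (Filter): count publications per decade.
per_decade = Counter(w["year"] // 10 * 10 for w in works)

# Task 3 (Sort, Find Extremum): the genre with the most works.
genre_counts = Counter(w["genre"] for w in works)
top_genre, top_count = genre_counts.most_common(1)[0]

print(multi_gesture)
print(dict(per_decade))
print(top_genre, top_count)
```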

Data Analysis

An interesting discovery from our first visualization is a large drop in the count of books published in the decade beginning in 1800, after which the count returned to its normal level in the decade beginning in 1810. One of our theories to explain this drop is that the Napoleonic Wars began during this decade and were fought throughout Europe as well as in Egypt and the Caribbean. Women may have spent their time supporting the war effort in whatever way they could, leaving little time to publish. We find this interesting because it suggests how significant historical events could affect the number of publications by women.

Our second visualization also yielded intriguing findings. Although our focus is the Women Writers Project, the data includes male authors as well. We set out to identify the location of every writer in our dataset and found that most male writers are associated with London, England, but that male writers as a group are spread across a more diverse set of locations. Most female writers are also associated with London, with only a few exceptions. Given our focus on writing from the 16th to 19th centuries, it is expected that many locations are the major cities of the time. The most interesting aspect of this graph, however, is that the large majority of female writing occurred in England, with very little elsewhere, whereas male writers are more dispersed across locations, although a majority are still in England.

Design Process

To begin our brainstorming process, we generated approximately 10 sketches on paper. From these, we ultimately selected two visualizations: a comprehensive grouped bar chart and a Sankey diagram. On our professor's advice, we integrated the interactive features into a single plot. Halfway through the project, we created digital sketches using UI tools to facilitate communication with our data provider, the Women Writers Project. They provided guidance on improving the accessibility of our color encoding. We also adjusted the granularity of the publication date to capture overall trends within the dataset. Finally, we refined both visualizations using various Python packages to present them in the most polished and effective way possible.

Data Visualization(s)

First Visualization

We chose a grouped bar chart over a stacked bar chart or a line chart because we wanted to display several interactive features at once, including the pop-up table for descriptions. A line chart would not pair clearly or intuitively with that table, and a stacked bar chart makes it hard to compare the counts of works in a particular genre. The chart visualizes the count of documents belonging to different genres published over time. Each bar represents a genre and is color-coded by genre. The x-axis shows the genres, and the height of each bar shows the count of documents in that genre. The animation feature lets the user see how the counts change over time. The chart also includes hover data that displays additional information about each bar, such as the description of the documents belonging to the genre.
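
An animated bar chart of this kind generally needs one row per time step and genre, including zero counts, so that every frame shows the same set of bars. The sketch below shows one way to prepare such rows in plain Python. It is an assumption about the data preparation, not a record of our actual code. The input pairs are hypothetical, and stepping the animation by decade is also assumed. A plotting library such as Plotly Express could consume rows like these through the `animation_frame` and `hover_data` parameters of `px.bar`, although this report does not specify which package we used.

```python
from collections import Counter

# Hypothetical (publication year, genre) pairs, not taken from the project data.
records = [
    (1794, "novels"),
    (1798, "poetry"),
    (1799, "poetry"),
    (1803, "novels"),
]


def build_frames(records):
    """Return one row per (decade, genre), filling missing pairs with 0."""
    counts = Counter((year // 10 * 10, genre) for year, genre in records)
    decades = sorted({decade for decade, _ in counts})
    genres = sorted({genre for _, genre in counts})
    # Counter returns 0 for unseen keys, so absent pairs become zero-height bars.
    return [
        {"decade": decade, "genre": genre, "count": counts[(decade, genre)]}
        for decade in decades
        for genre in genres
    ]


frames = build_frames(records)
for row in frames:
    print(row)
```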

Second Visualization

We initially planned to generate a network, but most works were cited only once, so we concluded that the resulting network would not be very meaningful. The Sankey diagram below comes from an assignment for another class. Because we had worked with this type of visualization before, we were more confident about creating one again. This time, we used the bibliography dataset and linked the gender of each author to the location where the book was published. This stage of the process was crucial for us because we had difficulty deciding on our final product. Without this step, we would not have achieved the outcomes we have today. Another key aspect of our project was the consistent feedback from our professor and teaching assistants, who offered constructive suggestions and helped us identify areas where we could improve.
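
Linking author gender to publication location amounts to counting each (gender, location) pair and expressing the counts as flows between two groups of nodes. The sketch below is a minimal, hedged illustration of that aggregation in plain Python. The input pairs are hypothetical rather than drawn from the bibliography dataset, and the output layout (a label list plus parallel `source`, `target`, and `value` lists of node indices) is an assumed target format of the kind that Sankey plotting tools such as Plotly's `go.Sankey` accept.

```python
from collections import Counter

# Hypothetical (author gender, publication location) pairs for illustration only.
entries = [
    ("female", "London"),
    ("female", "London"),
    ("male", "London"),
    ("male", "Dublin"),
]


def build_sankey_links(entries):
    """Aggregate pairs into node labels and index-based links."""
    flows = Counter(entries)
    genders = sorted({gender for gender, _ in flows})
    locations = sorted({location for _, location in flows})
    # Gender nodes come first; location nodes are offset by len(genders).
    gender_index = {g: i for i, g in enumerate(genders)}
    location_index = {loc: len(genders) + i for i, loc in enumerate(locations)}

    source, target, value = [], [], []
    for (gender, location), count in sorted(flows.items()):
        source.append(gender_index[gender])
        target.append(location_index[location])
        value.append(count)
    return genders + locations, source, target, value


labels, source, target, value = build_sankey_links(entries)
print(labels)
print(source, target, value)
```

Keeping separate index maps for genders and locations avoids a clash if the same string ever appeared in both columns.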